Abstract: In this paper we show that the Index Coding 
problem captures several important properties of the more 
general Network Coding problem. An instance of the Index 
Coding problem includes a server that holds a set of information
messages $X = \{x_1, \dots, x_k\}$ and a set of receivers $R$. 
Each receiver has some side information, known to the server, 
represented by a subset of X and demands another subset of X. 
The server uses a noiseless communication channel to broadcast 
encodings of messages in X to satisfy the receivers' demands. The 
goal of the server is to find an encoding scheme that requires 
the minimum number of transmissions. 

We show that any instance of the Network Coding problem 
can be efficiently reduced to an instance of the Index Coding 
problem. Our reduction shows that several important properties 
of the Network Coding problem carry over to the Index Coding 
problem. In particular, we prove that both scalar linear and 
vector linear codes are insufficient for achieving the minimal 
number of transmissions. 

I. Introduction 

Since its introduction by the seminal paper of Ahlswede et 
al. [1], the network coding paradigm has received significant 
interest from the research community (see, e.g., [2], [3] and 
references therein). Network coding extends the functionality 
of the intermediate network nodes from merely copying and 
forwarding their received messages to combining the informa- 
tion content of several incoming messages and forwarding the 
result over the outgoing edges. The network coding approach 
was shown to produce substantial gain over the traditional 
approach of routing and tree packing in many scenarios. 

The Index Coding problem has been recently introduced 
in [4] and has been the subject of several studies [5], [6], 
[7]. An instance of the Index Coding problem includes a 
server/transmitter that holds a set of information messages 
X and a set of receivers R, each of which has some 
side information represented by a subset of X, known to 
the server, and demands another subset of X. The server 
can broadcast encodings of messages in X over a noiseless 
channel. The objective is to identify an encoding scheme that 
satisfies the demands of all clients with the minimum number 
of transmissions. 

Figure 1 depicts an instance of the Index Coding problem 
that includes a server with four messages $x_1, \dots, x_4 \in \{0, 1\}$ 
and four clients. For each client, we show the set of messages 
it has (side information), and the set of messages it wants 
(demands). Note that the server can always satisfy the demands 
of the clients by sending all the messages. However, this 
solution is suboptimal since it is sufficient for the server to 
broadcast the two messages $x_1 + x_2 + x_3$ and $x_1 + x_4$ (all 
operations are over GF(2)). This shows that by using an 
efficient encoding scheme, the server can significantly reduce 
the number of transmissions, and, in turn, reduce the delay 
and the energy consumption. 

Fig. 1. An instance of the Index Coding problem. 
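To make the saving concrete, the following minimal sketch checks a two-transmission scalar code over GF(2) by exhaustive enumeration. The "has" and "wants" sets in `CLIENTS` are hypothetical, chosen only so that the broadcasts $x_1 + x_2 + x_3$ and $x_1 + x_4$ suffice; they are not a transcription of Figure 1.

```python
from itertools import product

# Hypothetical clients (an assumption, not taken from Figure 1):
# name -> (index of the wanted message, indices of the side-information messages)
CLIENTS = {
    "c1": (1, {2, 3}),
    "c2": (2, {1, 3}),
    "c3": (3, {1, 2}),
    "c4": (4, {1}),
}

def encode(x):
    """Return the two broadcasts x1+x2+x3 and x1+x4 over GF(2)."""
    return (x[1] ^ x[2] ^ x[3], x[1] ^ x[4])

def decode(client, broadcast, side):
    """Recover the wanted bit from the broadcast and the client's side information."""
    want, has = CLIENTS[client]
    t1, t2 = broadcast
    if want == 4:
        return t2 ^ side[1]          # cancel x1 from x1+x4
    out = t1                         # cancel the two known bits from x1+x2+x3
    for j in has:
        out ^= side[j]
    return out

def all_clients_satisfied():
    """Try all 16 message assignments and every client."""
    for bits in product((0, 1), repeat=4):
        x = dict(zip((1, 2, 3, 4), bits))
        b = encode(x)
        for name, (want, has) in CLIENTS.items():
            side = {j: x[j] for j in has}
            if decode(name, b, side) != x[want]:
                return False
    return True

print(all_clients_satisfied())
```

Under these assumed side-information sets, two transmissions replace the four needed when every message is sent uncoded.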

In general, each message can be divided into several packets 
and the encoding scheme can combine packets from different 
messages to minimize the number of transmissions. With 
linear index coding, all packets are considered to be elements 
of a certain finite field F and each transmitted packet is a 
linear combination of the packets corresponding to the original 
messages in X. The linear solutions can be further classified 
into scalar linear and vector linear. With a scalar linear 
solution, each message corresponds to exactly one packet, 
while with a vector linear solution each message can be 
divided into several packets. Note that the example shown in 
Figure 1 uses a scalar linear solution over $F = GF(2)$. 

The Index Coding problem was studied from an information 
theoretical perspective in [5]. The authors of [4] established 
lower and upper bounds on the minimum number of trans- 
missions based on the properties of a certain related graph. 
References [6] and [8] present several heuristic solutions 
for this problem. In addition, the authors of [7] showed 
the suboptimality of scalar linear encoding schemes, which 
disproves the conjecture of [4]. 



Contributions 

Index coding can be seen as a special case of the Network 
Coding problem. In this paper, we show that, nevertheless, 
several important properties of the more general Network 
Coding problem carry over to the Index Coding problem. To 
that end, we present a reduction that maps any instance of 
the Network Coding problem to a corresponding instance of 
the Index Coding problem. We use this reduction to establish 
several fundamental properties of the Index Coding problem. 

First, we show that a scalar linear solution may require more 
transmissions than a vector linear one. In particular, we show 
two instances of the Index Coding problem in which a vector 
linear solution that divides each message into two packets 
yields a smaller number of transmissions than a scalar linear 
solution. 

Second, we show that even vector linear solutions for 
the Index Coding problem are insufficient for achieving the 
minimal number of transmissions. In particular, we use our 
reduction and the construction presented in [9] to show an 
instance of the Index Coding problem for which a non-linear 
solution requires a lower number of transmissions than the 
linear one. 

II. Model 

A. Network Coding 

Let $G(V, E)$ be a graph with vertex set $V$ and edge set $E$. 
For each edge $e = (u, v) \in E$, we define the in-degree of $e$ to be 
the in-degree of its tail node $u$. Similarly, we define the out-degree 
of $e$ to be the out-degree of its head node $v$. Let $S \subset E$ 
be the subset of edges in $E$ of zero in-degree and let $D \subset E$ 
be the subset of edges in $E$ of zero out-degree. We refer to 
edges in $S$ and $D$ as input and output edges, respectively. 
We denote $m = |E|$, $k = |S|$, $d = |D|$, and assume that 
the edges in $E$ are indexed such that $S = \{e_1, \dots, e_k\}$ and 
$D = \{e_{m-d+1}, \dots, e_m\}$. Also, for each edge $e = (u, v) \in E$, 
we define $\mathcal{P}(e)$ to be the set of the parent edges of $e$, i.e., 
$\mathcal{P}(e) = \{(w, u);\ (w, u) \in E\}$. 

We represent a communication network by a 3-tuple 
$\mathcal{N}(G(V, E), X, \delta)$ defined by an acyclic graph $G(V, E)$, a message 
set $X = \{x_1, \dots, x_k\}$, and an onto function $\delta : D \to X$ 
from the set of output edges to the set of messages. Each 
message $x_i \in X$ consists of a vector of $n$ packets 
$x_i = (x_{i1}, \dots, x_{in})$. 

We assume that the message $x_i$ is available at the tail node 
of the input edge $e_i$. The function $\delta$, referred to as the demand 
function, represents, for each output edge $e_i \in D$, the message 
demanded by its head node. 
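As an illustration of these definitions, the sketch below computes the input edges $S$, the output edges $D$, and the parent sets $\mathcal{P}(e)$ for a small butterfly-style graph. The graph and the node names are assumptions made for this example and do not appear in the paper.

```python
# Hypothetical butterfly-style graph; each edge is a (tail, head) pair.
EDGES = [
    ("s1", "a"), ("s2", "b"),                 # tails s1, s2 have in-degree 0
    ("a", "c"), ("a", "t1"), ("b", "c"), ("b", "t2"),
    ("c", "d"), ("d", "t1"), ("d", "t2"),
    ("t1", "o1"), ("t2", "o2"),               # heads o1, o2 have out-degree 0
]

def in_degree(node):
    """Number of edges entering the node."""
    return sum(1 for (_, v) in EDGES if v == node)

def out_degree(node):
    """Number of edges leaving the node."""
    return sum(1 for (u, _) in EDGES if u == node)

def input_edges():
    """S: edges whose tail node has zero in-degree."""
    return [e for e in EDGES if in_degree(e[0]) == 0]

def output_edges():
    """D: edges whose head node has zero out-degree."""
    return [e for e in EDGES if out_degree(e[1]) == 0]

def parents(e):
    """P(e): edges entering the tail node of e = (u, v)."""
    u, _ = e
    return [f for f in EDGES if f[1] == u]

print(len(EDGES), input_edges(), output_edges())
print(parents(("c", "d")))
```

Here $m = 11$, $k = 2$ and $d = 2$, and the edge list is already ordered so that the input edges come first and the output edges last, matching the indexing convention above.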

Definition 1 (Network Code): Let $\mathcal{N}(G(V, E), X, \delta)$ be an 
instance of the Network Coding problem with $k = |X|$ 
messages, where each message is a vector of $n$ packets, 
$x_i = (x_{i1}, \dots, x_{in}) \in \Sigma^n$, and $\Sigma = \{0, \dots, q-1\}$ is a $q$-ary 
alphabet. Then, an $(n, q)$ network code of block length $n$ is a 
collection 

$$\mathcal{C} = \{ f_e = (f_e^1, \dots, f_e^n);\ f_e^i : (\Sigma^n)^k \to \Sigma,\ e \in E,\ 1 \le i \le n \}$$

of global encoding functions, indexed by the edges of $G$, that 
satisfy the following conditions: 

(N1) $f_{e_i}(X) = x_i$ for $i = 1, \dots, k$; 

(N2) $f_{e_i}(X) = \delta(e_i)$ for $i = m-d+1, \dots, m$; 

(N3) For each $e = (u, v) \in E \setminus S$ with $\mathcal{P}(e) = \{e_1, \dots, e_{p_e}\}$, 
there exists a function $\phi_e : (\Sigma^n)^{p_e} \to \Sigma^n$, referred 
to as the local encoding function of $e$, such that 
$f_e(X) = \phi_e(f_{e_1}(X), \dots, f_{e_{p_e}}(X))$, where $p_e$ is the 
in-degree of $e$ and $\mathcal{P}(e)$ is the set of parent edges of $e$. 

If $n = 1$, the network code is called a scalar network code, 
otherwise (if $n > 1$) it is called a vector or a block network 
code. If $\Sigma$ is a certain finite field $\mathbb{F}$, and all the global and 
local encoding functions are linear functions of the packets, 
the network code is called linear over $\mathbb{F}$. 

B. Index Coding 

An instance of the Index Coding problem $\mathcal{I}(X, R)$ includes 

1) A set of messages $X = \{x_1, \dots, x_k\}$; 

2) A set of clients $R \subseteq \{(x, H);\ x \in X,\ H \subseteq X \setminus \{x\}\}$. 

Here, $X$ represents the set of messages available at the server. 
A client is represented by a pair $(x, H) \in R$, where $x$ is the 
message required by the client, and $H \subseteq X$ is the set of messages 
available to the client as side information. We assume, without 
loss of generality, that each client needs exactly one message. 

As in the Network Coding problem, each message $x_i \in X$ 
is divided into $n$ packets $x_i = (x_{i1}, \dots, x_{in})$. We refer to 
parameter $n$ as the block length of the index code. 

Definition 2 (Index Code): Let $\mathcal{I}(X, R)$ be an instance of 
the Index Coding problem with $k = |X|$ messages, where each 
message $x_i$ is a vector of $n$ packets, $(x_{i1}, \dots, x_{in}) \in \Sigma^n$, 
and $\Sigma = \{0, \dots, q-1\}$ is a $q$-ary alphabet. Then, an optimal 
$(n, q)$ index code for $\mathcal{I}(X, R)$ is a function $f : (\Sigma^n)^k \to \Sigma^\ell$, 
such that 

(I1) for each client $r = (x, H) \in R$, there exists 
a function $\psi_r : \Sigma^{\ell + n|H|} \to \Sigma^n$ such that 
$\psi_r(f(x_1, \dots, x_k), (x_i)_{x_i \in H}) = x$; 

(I2) $\ell = \ell(n, q)$ is the smallest integer such that (I1) holds for 
the given $q$-ary alphabet and block length $n$. 

We refer to $\psi_r$ as the decoding function for client $r$. With a 
linear index code, the alphabet $\Sigma$ is a field and the functions 
$f$ and $\psi_r$ are linear in the variables $x_{ij}$. Similarly, if $n = 1$ the 
index code is called a scalar code and for $n > 1$ it is called a 
vector or block code. 

Our formulation of the Index Coding problem here differs 
from that of [4] and [7] in two aspects. First, the model of [4] 
and [7] assumes that for each message in X there is exactly 
one client that requests it. Our model does not make this 
assumption. Second, and more importantly, [4] and [7] focus 
on scalar linear codes (vector linear codes are mentioned in 
the conclusion of [7]), whereas we consider the more general 
case of vector linear codes. 

Let $\mathcal{I}(X, R)$ be an instance of the Index Coding problem. 
We denote by $\lambda(n, q) = \ell(n, q)/n$ the transmission rate of the 
optimal solution over an alphabet of size $q$. We also denote by 
$\lambda^*(n, q)$ the minimum rate achieved by a vector linear solution 
over a finite field $\mathbb{F}_q$. We are interested in the behavior of $\lambda$ 
and $\lambda^*$ as functions of $n$ and $q$. 

Let $\mu(\mathcal{I})$ be the size of the largest set of messages requested by a 
collection of clients with identical "has" sets, i.e., 

$$\mu(\mathcal{I}) = \max_{H \subseteq X} |\{x_i;\ (x_i, H) \in R\}|.$$

It is easy to verify that the 
optimal rate $\lambda(n, q)$ is lower bounded by $\mu(\mathcal{I})$, independently 
of the values of $n$ and $q$. 

Lemma 3: For any instance $\mathcal{I}(X, R)$ of the Index Coding 
problem it holds that 

$$\lambda(n, q) \ge \mu(\mathcal{I}).$$
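The quantity bounded in Lemma 3 is straightforward to compute. The following is a minimal sketch that groups clients by their "has" set and returns the size of the largest group of distinct demands; the function name `mu` and the three-client example are assumptions made for illustration.

```python
from collections import defaultdict

def mu(clients):
    """Largest number of distinct messages wanted by clients sharing one "has" set.

    clients: iterable of (wanted_message, has_set) pairs.
    """
    wanted_by_has = defaultdict(set)
    for wanted, has in clients:
        wanted_by_has[frozenset(has)].add(wanted)
    return max((len(w) for w in wanted_by_has.values()), default=0)

# Hypothetical instance: two clients hold only x3 and want different messages.
example = [
    ("x1", {"x3"}),
    ("x2", {"x3"}),
    ("x3", {"x1", "x2"}),
]
print(mu(example))
```

In this example the two clients holding only `x3` learn nothing from each other's demand, so at least two transmissions per packet are needed, in line with the bound.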

III. Main Result 

In this section we present a reduction from the Network 
Coding problem to the Index Coding problem. Specifically, 
for each instance $\mathcal{N}(G(V, E), X, \delta)$ of the Network Coding 
problem, we construct a corresponding instance $\mathcal{I}_{\mathcal{N}}$ of the 
Index Coding problem such that $\mathcal{I}_{\mathcal{N}}$ has an $(n, q)$ index code 
of rate $\lambda^*(n, q) = \lambda(n, q) = \mu(\mathcal{I}_{\mathcal{N}})$ if and only if there exists 
an $(n, q)$ linear network code for $\mathcal{N}$. 

Definition 4: Let $\mathcal{N}(G(V, E), X, \delta)$ be an instance of the 
Network Coding problem. We construct an instance $\mathcal{I}_{\mathcal{N}}(Y, R)$ 
of the Index Coding problem as follows: 

1) The set of messages $Y$ includes a message for each 
edge $e \in E$ and the messages $x_i \in X$, i.e., 
$Y = \{y_1, \dots, y_m\} \cup X$; 

2) The set of clients $R$ is a union of $R_1, \dots, R_5$ defined 
as follows: 

a) $R_1 = \{(x_i, \{y_i\});\ e_i \in S\}$ 

b) $R_2 = \{(y_i, \{x_i\});\ e_i \in S\}$ 

c) $R_3 = \{(y_i, \{y_j;\ e_j \in \mathcal{P}(e_i)\});\ e_i \in E \setminus S\}$ 

d) $R_4 = \{(\delta(e_i), \{y_i\});\ e_i \in D\}$ 

e) $R_5 = \{(y_i, X);\ i = 1, \dots, m\}$ 

It is easy to verify that for instance $\mathcal{I}_{\mathcal{N}}(Y, R)$ it holds that 

$$\mu(\mathcal{I}_{\mathcal{N}}) = m.$$
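The construction of Definition 4 can be sketched in a few lines. The butterfly-style network, its demand map, and the helper names below are assumptions for illustration; the code assumes the edge list is ordered with input edges first, so that edge $e_i$ carries message $x_i$ for $i \le k$.

```python
from collections import defaultdict

# Hypothetical butterfly-style network (not taken from the paper).
EDGES = [
    ("s1", "a"), ("s2", "b"),
    ("a", "c"), ("a", "t1"), ("b", "c"), ("b", "t2"),
    ("c", "d"), ("d", "t1"), ("d", "t2"),
    ("t1", "o1"), ("t2", "o2"),
]
DEMAND = {("t1", "o1"): "x2", ("t2", "o2"): "x1"}   # demand function on D

def build_index_instance(edges, demand):
    """Return (messages, clients) of the index coding instance for the network."""
    heads = {v for _, v in edges}
    y = {e: f"y{i}" for i, e in enumerate(edges, start=1)}
    inputs = [e for e in edges if e[0] not in heads]      # S
    x = {e: f"x{i}" for i, e in enumerate(inputs, start=1)}
    X = frozenset(x.values())
    clients = set()
    for e in inputs:
        clients.add((x[e], frozenset({y[e]})))            # R1
        clients.add((y[e], frozenset({x[e]})))            # R2
    for e in edges:
        if e not in x:                                    # R3: e in E \ S
            par = frozenset(y[f] for f in edges if f[1] == e[0])
            clients.add((y[e], par))
    for e, msg in demand.items():                         # R4
        clients.add((msg, frozenset({y[e]})))
    for e in edges:                                       # R5
        clients.add((y[e], X))
    return set(y.values()) | X, clients

def mu(clients):
    """Largest number of distinct demands among clients with one "has" set."""
    groups = defaultdict(set)
    for wanted, has in clients:
        groups[has].add(wanted)
    return max(len(w) for w in groups.values())

messages, clients = build_index_instance(EDGES, DEMAND)
print(len(messages), len(clients), mu(clients))
```

For this network the clients in $R_5$ all hold $X$ and want $m = 11$ different messages, which is the group that attains $\mu(\mathcal{I}_{\mathcal{N}}) = m$.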

Theorem 5: Let $\mathcal{N}(G(V, E), X, \delta)$ be an instance of the 
Network Coding problem and let $\mathcal{I}_{\mathcal{N}}(Y, R)$ be the corresponding 
instance of the Index Coding problem as defined above. 
Then, there exists a linear $(n, q)$ index code for $\mathcal{I}_{\mathcal{N}}$ with 
$\lambda^*(n, q) = \mu(\mathcal{I}_{\mathcal{N}})$ iff there exists a linear $(n, q)$ network code 
for $\mathcal{N}$. 

Proof: Suppose there is a linear $(n, q)$ network code 
$\mathcal{C} = \{f_e(X);\ f_e : (\mathbb{F}_q^n)^k \to \mathbb{F}_q^n,\ e \in E\}$ for $\mathcal{N}$ over the finite 
field $\mathbb{F}_q$ for some integer $n$. 

Define $g : (\mathbb{F}_q^n)^{m+k} \to (\mathbb{F}_q^n)^m$ such that 
$\forall Z = (x_1, \dots, x_k, y_1, \dots, y_m) \in (\mathbb{F}_q^n)^{m+k}$, 
$g(Z) = (g_1(Z), \dots, g_m(Z))$ with 

$$g_i(Z) = y_i + x_i, \quad i = 1, \dots, k,$$

$$g_i(Z) = y_i + f_{e_i}(X), \quad i = k+1, \dots, m.$$

Next, we show that $g(Z)$ is in fact an index code for $\mathcal{I}_{\mathcal{N}}$ 
by proving the existence of the decoding functions $\psi_r$. We 
consider five cases: 

1) $\forall r = (x_i, \{y_i\}) \in R_1$, $\psi_r = g_i(Z) - y_i$, 

2) $\forall r = (y_i, \{x_i\}) \in R_2$, $\psi_r = g_i(Z) - x_i$, 

3) $\forall r = (y_i, \{y_{i_1}, \dots, y_{i_p}\}) \in R_3$, since $\mathcal{C}$ is a linear 
network code for $\mathcal{N}$, there exists a linear function $\phi_{e_i}$ 
such that $f_{e_i}(X) = \phi_{e_i}(f_{e_{i_1}}(X), \dots, f_{e_{i_p}}(X))$. Thus, 
$\psi_r = g_i(Z) - \phi_{e_i}(g_{i_1}(Z) - y_{i_1}, \dots, g_{i_p}(Z) - y_{i_p})$, 

4) $\forall r = (\delta(e_i), \{y_i\}) \in R_4$, $e_i \in D$, $\psi_r = g_i(Z) - y_i$, 

5) $\forall r = (y_i, X) \in R_5$, $\psi_r = g_i(Z) - f_{e_i}(X)$. 
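The five decoding cases above can be checked exhaustively on a small example. The sketch below uses a scalar code over GF(2) on a hypothetical butterfly-style network with $k = 2$ messages and $m = 11$ edges, in which every non-input edge forwards the XOR of its parent edges; the network, the tables `PARENTS` and `DEMAND`, and the function names are assumptions for illustration.

```python
from itertools import product

K, M = 2, 11   # two messages, eleven edges, indexed in topological order
# PARENTS[i]: indices of the parent edges of the non-input edge e_i.
PARENTS = {3: [1], 4: [1], 5: [2], 6: [2], 7: [3, 5],
           8: [7], 9: [7], 10: [4, 8], 11: [6, 9]}
DEMAND = {10: 2, 11: 1}   # output edge e_10 demands x2, e_11 demands x1

def global_code(x):
    """f[i]: bit carried by edge e_i; the local function is the XOR of the parents."""
    f = {1: x[1], 2: x[2]}
    for i in range(K + 1, M + 1):
        f[i] = 0
        for j in PARENTS[i]:
            f[i] ^= f[j]
    return f

def index_encode(x, y):
    """g_i(Z) = y_i + f_{e_i}(X) over GF(2), one broadcast symbol per edge."""
    f = global_code(x)
    return {i: y[i] ^ f[i] for i in range(1, M + 1)}

def check_all_clients():
    """Verify the decoders of cases 1 to 5 for every assignment of (X, Y)."""
    for bits in product((0, 1), repeat=K + M):
        x = {1: bits[0], 2: bits[1]}
        y = {i: bits[K + i - 1] for i in range(1, M + 1)}
        g, f = index_encode(x, y), global_code(x)
        for i in range(1, K + 1):
            if g[i] ^ y[i] != x[i] or g[i] ^ x[i] != y[i]:   # R1 and R2
                return False
        for i, ps in PARENTS.items():                        # R3
            est = 0
            for j in ps:
                est ^= g[j] ^ y[j]       # local function applied to g_j - y_j
            if g[i] ^ est != y[i]:
                return False
        for i, msg in DEMAND.items():                        # R4
            if g[i] ^ y[i] != x[msg]:
                return False
        for i in range(1, M + 1):                            # R5
            if g[i] ^ f[i] != y[i]:
                return False
    return True

print(check_all_clients())
```

The index code broadcasts exactly $m$ symbols, one per edge, so its rate equals $\mu(\mathcal{I}_{\mathcal{N}}) = m$ for this example.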
Now assume that $g : (\mathbb{F}_q^n)^{m+k} \to (\mathbb{F}_q^n)^m$ is a linear 
$(n, q)$ index code for $\mathcal{I}_{\mathcal{N}}$ over the field $\mathbb{F}_q$, such that 
$\forall Z = (x_1, \dots, x_k, y_1, \dots, y_m) \in (\mathbb{F}_q^n)^{m+k}$, 
$g(Z) = (g_1(Z), \dots, g_m(Z))$, with $x_i, y_i, g_i(Z) \in \mathbb{F}_q^n$. We write 

$$g_i(Z) = \sum_{j=1}^{k} x_j A_{ij} + \sum_{j=1}^{m} y_j B_{ij},$$

where $i = 1, \dots, m$ and $A_{ij}, B_{ij} \in M_{\mathbb{F}_q}(n, n)$ are sets of 
$n \times n$ matrices with elements in $\mathbb{F}_q$. 

The functions $\psi_r$ exist for all $r \in R_5$ iff the matrix 
$M = [B_{ij}] \in M_{\mathbb{F}_q}(nm, nm)$, which has the matrix $B_{ij}$ as 
a block submatrix in the $(i, j)$th position, is invertible. Define 
$h : (\mathbb{F}_q^n)^{m+k} \to (\mathbb{F}_q^n)^m$, such that $h(Z) = g(Z) M^{-1}$, $\forall Z \in (\mathbb{F}_q^n)^{m+k}$. 
So, we obtain 

$$h_i(Z) = y_i + \sum_{j=1}^{k} x_j C_{ij}, \quad i = 1, \dots, m,$$

where $[C_{ij}] \in M_{\mathbb{F}_q}(n, n)$. We note this is a valid index 
code for $\mathcal{I}_{\mathcal{N}}$. In fact, $\forall r = (x, H) \in R$ with $\psi_r(g, (x')_{x' \in H}) = x$, 
$\psi'_r(h, (x')_{x' \in H}) = \psi_r(hM, (x')_{x' \in H})$ is a valid decoding 
function corresponding to the client $r$ and the index code $h(Z)$. 

For all $r \in R_1 \cup R_4$, $\psi'_r$ exists iff for $i = 1, \dots, k, m-d+1, \dots, m$ 
and $j \ne i$ it holds that $C_{ij} = [0] \in M_{\mathbb{F}_q}(n, n)$ 
and $C_{ii}$ is invertible, where $[0]$ denotes the all zeros matrix. 
This implies that 

$$h_i(Z) = y_i + x_i C_{ii}, \quad i = 1, \dots, k,$$

$$h_i(Z) = y_i + \sum_{j=1}^{k} x_j C_{ij}, \quad i = k+1, \dots, m-d, \qquad (1)$$

$$h_i(Z) = y_i + \delta(e_i) C_{ii}, \quad i = m-d+1, \dots, m.$$

Next, we define the functions $f_{e_i} : (\mathbb{F}_q^n)^k \to \mathbb{F}_q^n$, $e_i \in E$, as 
follows: 

1) $f_{e_i}(X) = x_i$, for $i = 1, \dots, k$ 

2) $f_{e_i}(X) = \sum_{j=1}^{k} x_j C_{ij}$, for $i = k+1, \dots, m-d$ 

3) $f_{e_i}(X) = \delta(e_i)$, for $i = m-d+1, \dots, m$. 

Then $\mathcal{C} = \{f_{e_i};\ e_i \in E\}$ is a linear $(n, q)$ network code for 
$\mathcal{N}$. To show that, it suffices to prove that condition N3 holds. 

Let $e_i$ be an edge in $E \setminus S$ with the set $\mathcal{P}(e_i) = 
\{e_{i_1}, \dots, e_{i_p}\}$ of parent edges. We denote $I_i = \{i_1, \dots, i_p\}$ 
and $r_i = (y_i, \{y_{i_1}, \dots, y_{i_p}\}) \in R_3$. Then, there is a linear 
function $\psi'_{r_i}$ such that $y_i = \psi'_{r_i}(h_1, \dots, h_m, y_{i_1}, \dots, y_{i_p})$. 
Hence, there exist $T_{ij}, T'_{i\alpha} \in M_{\mathbb{F}_q}(n, n)$ such that 

$$y_i = \sum_{j=1}^{m} h_j T_{ij} + \sum_{\alpha \in I_i} y_\alpha T'_{i\alpha}.$$

Using Eq. (1), we get that $T_{ii}$ is the identity matrix, 
$T'_{i\alpha} = -T_{i\alpha}$ $\forall \alpha \in I_i$, and $T_{ij} = [0]$ $\forall j \notin I_i \cup \{i\}$. Therefore, 

$$f_{e_i} = -\sum_{\alpha \in I_i} f_{e_\alpha} T_{i\alpha}, \quad \forall e_i \in E \setminus S,$$

and $\mathcal{C}$ is a feasible network code for $\mathcal{N}$. ■ 



Lemma 6: Let $\mathcal{N}(G(V, E), X, \delta)$ be an instance of the Network 
Coding problem and let $\mathcal{I}_{\mathcal{N}}(Y, R)$ be the corresponding 
index problem. If there is an $(n, q)$ network code (not necessarily 
linear) for $\mathcal{N}$, then there is an $(n, q)$ index code for $\mathcal{I}_{\mathcal{N}}$ 
with $\lambda(n, q) = \mu(\mathcal{I}_{\mathcal{N}}) = m$, where $m = |E|$. 

Proof: The proof can be obtained by slightly modifying 
the first part of the proof of Theorem 5. ■ 

IV. Applications 

A. Block Encoding 

Index coding, as noted in [4], [7], is reminiscent of the zero- 
error source coding with side information problem discussed 
by Witsenhausen in [10]. Two cases were studied there de- 
pending on whether the transmitter knows the side information 
available to the receiver or not. It was shown that in the former 
case repeated scalar encoding is optimal, i.e. block encoding 
does not provide any benefit. We will demonstrate in this 
section that this result does not always hold for the Index 
Coding problem which can be regarded as an extension of the 
point to point problem discussed in [10]. 

Let $\mathcal{N}_1$ be the M-network introduced in [11] and depicted 
in Figure 2(a). It was shown in [12] that this network does 
not have a scalar linear solution, but has a vector linear one 
of block length 2. Interestingly, such a vector linear solution 
does not require encoding. In fact, reference [12] proves a 
more general theorem: 

Theorem 7: The M-network has a linear network code of 
block length n iff n is even. 

Next, we present another network $\mathcal{N}_2$, that we refer to as 
the non-Pappus network, and that has the same property as the 
M-network. Both of these networks will be used to construct 
two instances of the Index Coding problem where vector linear 
outperforms scalar linear coding. 

Definition 8 (non-Pappus Network): Let $S_0 = 
\{\{1, 2, 3\}, \{1, 5, 7\}, \{3, 5, 9\}, \{2, 4, 7\}, \{4, 5, 6\}, \{2, 6, 9\}, 
\{1, 6, 8\}, \{3, 4, 8\}\}$, and $S_1 = \{I \subset \{1, 2, \dots, 9\};\ |I| = 3\} \setminus S_0$. 
The non-Pappus network $\mathcal{N}_2$ is obtained by adding 
to the network depicted in Figure 2(b) a node $n_I$ for each 
$I = \{i, j, k\} \in S_1$, the edges $(n_i, n_I)$, $(n_j, n_I)$, $(n_k, n_I)$, 
and three output edges outgoing from $n_I$, each of which 
demands a different $x_i$. 

Theorem 9: There is no scalar linear network code for the 
non-Pappus network over any field, but there is a (2, 3) linear 
one. 

Proof: Let $\mathcal{C} = \{f_e;\ e \in \mathcal{N}_2\}$ be a scalar linear 
network code for $\mathcal{N}_2$ over a certain field $\mathbb{F}$. Without loss 
of generality, we assume that for each node $n_i$ of $\mathcal{N}_2$, the 
functions associated with its output edges are identical. We 
then define $f_i = f_e$ where $e$ is an edge outgoing from $n_i$, 
$i = 1, \dots, 9$, and write $f_i = a_{i1} x_1 + a_{i2} x_2 + a_{i3} x_3 = a_i \cdot X^T$, 
where $X = (x_1, x_2, x_3)$ and $a_i = (a_{i1}, a_{i2}, a_{i3})$. 

Since $\forall I = \{i, j, k\} \in S_1$, the edges outgoing from node 
$n_I$ demand $x_1$, $x_2$ and $x_3$, we have $\mathrm{rank}\{a_i, a_j, a_k\} = 3$. 
Furthermore, from the connectivity of $\mathcal{N}_2$, we deduce that 
$a_2$ should be a linear combination of $a_1$ and $a_3$, giving 
$\mathrm{rank}\{a_1, a_2, a_3\} < 3$. But $\mathrm{rank}\{a_1, a_2, a_4\} = 3$, which 
implies that $\mathrm{rank}\{a_1, a_2, a_3\} > 1$, hence $\mathrm{rank}\{a_1, a_2, a_3\} = 2$. 
Similarly, $\forall \{i, j, k\} \in S_0$, $\mathrm{rank}\{a_i, a_j, a_k\} = 2$. 

Fig. 2. (a) The M-network $\mathcal{N}_1$. (b) A subnetwork of the non-Pappus network 
$\mathcal{N}_2$. 

Fig. 3. A graphical representation of the non-Pappus matroid of rank 3 [13, 
p. 43]. Cycles are represented by straight lines. 

Therefore, letting $A = \{a_1, a_2, \dots, a_9\}$, the matroid 
$\mathcal{M}(A, \mathrm{rank})$ is the non-Pappus matroid shown in Figure 3 [13, 
p. 43]. Therefore, the vectors $a_i$ form a linear representation 
of $\mathcal{M}$ over $\mathbb{F}$. But, by Pappus theorem [13, p. 173], the non-Pappus 
matroid is not linearly representable over any field, 
which leads to a contradiction. So, $\mathcal{N}_2$ does not have a scalar 
linear solution. 

Let $x_1 = (x, y)$, $x_2 = (w, z)$, $x_3 = (u, v) \in \mathbb{F}_3^2$. Define 

$f_1(X) = x_1$, 
$f_2(X) = (x + w,\ y + z)$, 
$f_3(X) = x_2$, 
$f_4(X) = (x + u + 2z,\ y + 2v + w + z)$, 
$f_5(X) = x_3$, 
$f_6(X) = (x + 2u + 2v + 2z,\ y + u + w + z)$, 
$f_7(X) = (x + v,\ y + u + 2v)$, 
$f_8(X) = (x + u + w + z,\ y + 2v + w)$, 
$f_9(X) = (u + w,\ v + z)$. 

These functions correspond to the multilinear (or 
partition) representation of the non-Pappus matroid discussed 
in [14], [15]. For each edge $e \in G$ outgoing from node $n_i$, 
$i = 1, \dots, 9$, define $f_e = f_i$. And for each edge $e \in D$, let 
$f_e = \delta(e)$. Then, $\{f_e;\ e \in \mathcal{N}_2\}$ is a $(2, 3)$ network code for the 
non-Pappus network. ■ 
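The rank structure of these nine functions can be inspected numerically. The sketch below is an illustration under stated assumptions: it encodes each $f_i$ as two coefficient rows over GF(3) in the variable order $(x, y, w, z, u, v)$, transcribed from the definitions above, and computes the rank of the stacked rows for a triple of nodes. A node $n_I$ can recover all three messages only if its triple has rank 6; in a block-length-2 representation the dependent triples of $S_0$ would be expected to have rank 4. The helper names are hypothetical.

```python
from itertools import combinations

P = 3
# Coefficient rows of f_1..f_9 in the variable order (x, y, w, z, u, v).
F = {
    1: [(1, 0, 0, 0, 0, 0), (0, 1, 0, 0, 0, 0)],
    2: [(1, 0, 1, 0, 0, 0), (0, 1, 0, 1, 0, 0)],
    3: [(0, 0, 1, 0, 0, 0), (0, 0, 0, 1, 0, 0)],
    4: [(1, 0, 0, 2, 1, 0), (0, 1, 1, 1, 0, 2)],
    5: [(0, 0, 0, 0, 1, 0), (0, 0, 0, 0, 0, 1)],
    6: [(1, 0, 0, 2, 2, 2), (0, 1, 1, 1, 1, 0)],
    7: [(1, 0, 0, 0, 0, 1), (0, 1, 0, 0, 1, 2)],
    8: [(1, 0, 1, 1, 1, 0), (0, 1, 1, 0, 0, 2)],
    9: [(0, 0, 1, 0, 1, 0), (0, 0, 0, 1, 0, 1)],
}
S0 = [frozenset(t) for t in
      [(1, 2, 3), (1, 5, 7), (3, 5, 9), (2, 4, 7),
       (4, 5, 6), (2, 6, 9), (1, 6, 8), (3, 4, 8)]]
S1 = [frozenset(c) for c in combinations(range(1, 10), 3)
      if frozenset(c) not in S0]

def rank_mod_p(rows, p=P):
    """Rank of a matrix over GF(p) by Gauss-Jordan elimination."""
    rows = [[v % p for v in r] for r in rows]
    rank = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], p - 2, p)      # inverse via Fermat's little theorem
        rows[rank] = [(v * inv) % p for v in rows[rank]]
        for r in range(len(rows)):
            if r != rank and rows[r][col]:
                c = rows[r][col]
                rows[r] = [(a - c * b) % p for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank

def triple_rank(triple):
    """Rank of the six rows contributed by the three nodes in the triple."""
    return rank_mod_p([row for i in sorted(triple) for row in F[i]])

def deficient_triples():
    """Triples of S1 whose node could not recover x_1, x_2 and x_3."""
    return [sorted(t) for t in S1 if triple_rank(t) != 6]

print([triple_rank(t) for t in S0])
print(triple_rank({7, 8, 9}), deficient_triples())
```

Running `deficient_triples()` lists any triple of $S_1$ that falls short of full rank, which gives a direct way to test a transcription of the code against the decoding requirement at the nodes $n_I$.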

Now, consider $\mathcal{I}_{\mathcal{N}_1}$ and $\mathcal{I}_{\mathcal{N}_2}$, the two Index Coding problems 
corresponding respectively to the M-network and the non-Pappus 
network obtained by the construction of the previous 
section. Neither admits a scalar linear index code that 
achieves the bound of Lemma 3, but both have linear index codes 
of length 2, $\mathcal{I}_{\mathcal{N}_1}$ over $\mathbb{F}_2$ and $\mathcal{I}_{\mathcal{N}_2}$ over $\mathbb{F}_3$, that meet this 
bound. Thus, $\mathcal{I}_{\mathcal{N}_1}$ and $\mathcal{I}_{\mathcal{N}_2}$ are two instances of the Index 
Coding problem where vector linear coding outperforms scalar 
linear coding. This result can be summarized by the following 
corollary: 

Corollary 10: For $\mathcal{I}_{\mathcal{N}_1}$, $\lambda^*(2, 2) < \lambda^*(1, 2)$. And for 
$\mathcal{I}_{\mathcal{N}_2}$, $\lambda^*(2, 3) < \lambda^*(1, 3)$. 

Proof: Follows directly from Theorems 5, 7 and 9. ■ 

B. Linearity vs. Non-Linearity 

Linearity is a desired property for any code, including index 
codes. It was conjectured in [4] that binary scalar linear index 
codes are optimal, meaning that $\lambda^*(1, 2) = \lambda(1, 2)$ for any 
instance of the Index Coding problem. The authors of [7] 
disproved this conjecture for scalar linear codes by providing, 
for any given number of messages $k$ and field $\mathbb{F}_q$, an instance 
of the Index Coding problem with a large gap between $\lambda^*(1, q)$ 
and $\lambda(1, q)$. 

In this section we show that vector linear codes are in- 
sufficient for minimizing the number of transmissions. In 
particular, we prove that non-linear index codes outperform 
vector linear codes for any choice of field and block length 
n. Our proof is based on the insufficiency of linear network 
codes shown in [9]. 

First, we present the network $\mathcal{N}_3$ depicted in Figure 4, which 
was introduced and studied in [9]. The following theorem was 
proved in [9]. 

Theorem 11: The network $\mathcal{N}_3$ does not have a linear solution, 
but has a $(2, 4)$ non-linear solution. 

Let $\mathcal{I}_{\mathcal{N}_3}$ be an instance of the Index Coding problem that 
corresponds to $\mathcal{N}_3$, constructed according to Definition 4. 
Theorem 11 implies that $\mathcal{I}_{\mathcal{N}_3}$ does not have a linear solution 
that achieves $\mu(\mathcal{I}_{\mathcal{N}_3})$, the lower bound of Lemma 3. However, 
by Lemma 6, the $(2, 4)$ non-linear code of $\mathcal{N}_3$ can be used to 
construct a $(2, 4)$ non-linear index code for $\mathcal{I}_{\mathcal{N}_3}$ that achieves 
the lower bound of Lemma 3. Hence, we obtain the following 
result: 
Corollary 12: For the instance $\mathcal{I}_{\mathcal{N}_3}$ of the Index Coding 
problem it holds that $\lambda(2, 4) = \mu(\mathcal{I}_{\mathcal{N}_3})$. Furthermore, for any 
positive integer $n$ and prime power $q$, it holds that $\lambda^*(n, q) > \lambda(2, 4)$. 

V. Conclusion 

In this paper we studied the connection between the Index 
Coding and Network Coding problems. We showed a reduction 
that maps each communication network $\mathcal{N}$ to an instance of 
the Index Coding problem $\mathcal{I}_{\mathcal{N}}$ such that $\mathcal{N}$ has a linear network 
code if and only if $\mathcal{I}_{\mathcal{N}}$ has a linear index code over the 
same field that satisfies a certain condition on the number of 
transmissions. 

This reduction allowed us to apply many important results 
for network coding to index coding. For instance, we intro- 
duced the non-Pappus network and showed that it does not 
have a scalar linear network code, but has a vector linear one. 
The non-Pappus network and the M-network of [11] 
were used to construct index coding instances where vector 
linear solutions outperform scalar linear solutions. Another 
application of this reduction concerns the comparison of linear 
and non-linear index codes. Using the results of Dougherty et 
al. in [9] we proved the insufficiency of vector linear solutions 
for the Index Coding problem. 

Fig. 4. The network $\mathcal{N}_3$ of [9]. $\mathcal{N}_3$ does not have a linear network code 
over any field, but has a non-linear one over a quaternary alphabet. 